Proceedings Template - WORD

Authors

  • Rong Jin
  • Alex G. Hauptmann
  • Jamie Callan
Abstract

In this paper, we explore how to use metadata in information retrieval tasks. We present a new language model that takes advantage of category information for documents to improve retrieval accuracy. We compare the new language model with the traditional language model on the TREC4 dataset, where the collection information for documents is obtained using k-means clustering. The new language model outperforms the traditional language model, which supports our claim.
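The abstract does not state the form of the new language model, so the following is only an illustrative sketch of one common way to fold category information into a query-likelihood model: a linear interpolation of a document model, a category (cluster) model, and a collection model. The function names `unigram` and `score`, the weights `lam_doc` and `lam_cat`, and the interpolation itself are assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def unigram(tokens):
    """Maximum-likelihood unigram distribution over a list of tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def score(query, doc, category_docs, collection_docs, lam_doc=0.6, lam_cat=0.3):
    """Log query likelihood under a three-way interpolated model.

    Hypothetical weights: lam_doc for the document model, lam_cat for the
    category model; the remainder goes to the collection model.
    """
    p_doc = unigram(doc)
    p_cat = unigram([w for d in category_docs for w in d])
    p_col = unigram([w for d in collection_docs for w in d])
    lam_col = 1.0 - lam_doc - lam_cat

    total = 0.0
    for w in query:
        p = (lam_doc * p_doc.get(w, 0.0)
             + lam_cat * p_cat.get(w, 0.0)
             + lam_col * p_col.get(w, 0.0))
        if p == 0.0:
            # The query term is unseen everywhere in the collection.
            return float("-inf")
        total += math.log(p)
    return total
```

In this sketch, a document that does not contain a query term can still score above another such document if its category uses the term often. Setting `lam_cat=0` reduces the sketch to a document model smoothed only by the collection. In the setting the abstract describes, the `category_docs` for each document would be the members of its k-means cluster.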


Similar resources

Proceedings Template - WORD

Path loss and delay profile models for ITS applications, based on data measured in the 700 MHz band, are presented.


Proceedings Template - WORD

Preserving privacy while publishing social network data has become a serious issue with the rapid growth of social networks. In this work, we propose a perturbation-based approach for privacy-preserving publication of social network graphs and evaluate the utility of our proposed method using a real-world dataset.


Proceedings Template - WORD

This poster presents a computational analysis of conceptual metaphors in a community of political blogs. Like sentiment analysis or opinion extraction, computational metaphor identification can provide a means of understanding the particular framings or conceptualizations used in a community. This poster includes an overview of the implementation and a summary of results.


POC-NLW Template for Chinese Word Segmentation

In this paper, a language tagging template named POC-NLW (position of a character within an n-length word) is presented. Based on this template, a two-stage statistical model for Chinese word segmentation is constructed. In this method, the basic word segmentation is based on an n-gram language model, and a Hidden Markov tagger based on the POC-NLW template is used to implement the out-of-vocabular...


CASRA+: A Colloquial Arabic Speech Recognition Application

The research presented here concerns an Arabic speech recognition application, concentrating on the Lebanese dialect. The system starts by sampling the speech, that is, converting the sound from analog to digital, and then extracts features using Mel-Frequency Cepstral Coefficients (MFCC). The extracted features are then compared with the system's stored model; in this...


Proceedings Template - WORD

Most of the approaches for dealing with uncertainty in the Semantic Web rely on the principle that this uncertainty is already asserted. In this paper, we propose a new approach to learn and reason about uncertainty in the Semantic Web. Using instance data, we learn the uncertainty of an OWL ontology, and use that information to perform probabilistic reasoning on it. For this purpose, we use Ma...




Publication date: 2002